correlated condition analysis for simple guards #3185 #3212

Open
jorenham wants to merge 2 commits into facebook:main from jorenham:gh-3185

@jorenham
Contributor

@jorenham jorenham commented Apr 23, 2026

Summary

I had Claude Opus 4.7 take a stab at #3185, and to my (untrained) eyes it actually looks pretty decent. Since I'm not very familiar with this bit of the codebase, I opened it as a draft, intending to see what mypy primer has to say about it, and to see if it's actually as easy

Fixes #3185

Test Plan

Added some tests, and the failing test from 54cfb04 now passes.

@meta-cla meta-cla Bot added the cla signed label Apr 23, 2026
@jorenham jorenham changed the title from "correlated condition analysis for simple guards" to "correlated condition analysis for simple guards #3185" Apr 23, 2026
@jorenham
Contributor Author

cc @MarcoGorelli

@jorenham

This comment was marked as resolved.

@github-actions github-actions Bot added size/l and removed size/l labels Apr 23, 2026
@jorenham

This comment was marked as outdated.

@jorenham jorenham marked this pull request as ready for review April 23, 2026 17:29
@github-actions

Diff from mypy_primer, showing the effect of this PR on open source code:

pip (https://github.com/pypa/pip)
- ERROR src/pip/_vendor/requests/utils.py:746:16-25: `old_value` may be uninitialized [unbound-name]

openlibrary (https://github.com/internetarchive/openlibrary)
- ERROR openlibrary/plugins/upstream/borrow.py:180:28-39: `ia_itemname` may be uninitialized [unbound-name]
- ERROR openlibrary/plugins/upstream/borrow.py:180:47-54: `s3_keys` may be uninitialized [unbound-name]

cloud-init (https://github.com/canonical/cloud-init)
- ERROR tests/unittests/test_util.py:614:27-35: `confd_fn` may be uninitialized [unbound-name]

egglog-python (https://github.com/egraphs-good/egglog-python)
- ERROR python/egglog/egraph.py:1244:17-24: `egraphs` may be uninitialized [unbound-name]

rotki (https://github.com/rotki/rotki)
- ERROR rotkehlchen/db/drivers/gevent.py:211:117-127: `identifier` may be uninitialized [unbound-name]
- ERROR rotkehlchen/db/drivers/gevent.py:482:98-108: `identifier` may be uninitialized [unbound-name]
- ERROR rotkehlchen/tests/api/test_database.py:55:29-45: `backup2_contents` may be uninitialized [unbound-name]
- ERROR rotkehlchen/tests/api/test_database.py:57:29-45: `backup1_contents` may be uninitialized [unbound-name]

cryptography (https://github.com/pyca/cryptography)
- ERROR src/cryptography/hazmat/primitives/serialization/ssh.py:1091:25-34: `cert_body` may be uninitialized [unbound-name]
- ERROR src/cryptography/hazmat/primitives/serialization/ssh.py:1105:13-18: `nonce` may be uninitialized [unbound-name]

scikit-learn (https://github.com/scikit-learn/scikit-learn)
- ERROR sklearn/ensemble/_gb.py:922:28-40: `y_oob_masked` may be uninitialized [unbound-name]
- ERROR sklearn/ensemble/_gb.py:924:35-59: `sample_weight_oob_masked` may be uninitialized [unbound-name]
- ERROR sklearn/ensemble/_gb.py:926:33-45: `initial_loss` may be uninitialized [unbound-name]
- ERROR sklearn/gaussian_process/_gpr.py:654:36-59: `log_likelihood_gradient` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_coordinate_descent.py:182:34-40: `X_mean` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_coordinate_descent.py:184:34-40: `X_mean` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_huber.py:66:24-33: `intercept` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_ridge.py:277:26-28: `sw` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_ridge.py:294:27-29: `sw` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_ridge.py:741:32-50: `sample_weight_sqrt` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_ridge.py:754:32-50: `sample_weight_sqrt` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_ridge.py:821:32-50: `sample_weight_sqrt` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/tests/test_base.py:834:41-50: `intercept` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/tests/test_base.py:866:41-52: `intercept_0` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/tests/test_coordinate_descent.py:1294:41-50: `intercept` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/tests/test_coordinate_descent.py:1326:41-52: `intercept_0` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/tests/test_coordinate_descent.py:1521:41-50: `intercept` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/tests/test_ridge.py:2415:41-50: `intercept` may be uninitialized [unbound-name]
- ERROR sklearn/neighbors/_base.py:382:16-26: `neigh_dist` may be uninitialized [unbound-name]
- ERROR sklearn/preprocessing/_data.py:266:19-24: `mean_` may be uninitialized [unbound-name]
- ERROR sklearn/preprocessing/_data.py:283:45-51: `scale_` may be uninitialized [unbound-name]

setuptools (https://github.com/pypa/setuptools)
- ERROR setuptools/_vendor/autocommand/autoasync.py:140:43-50: `new_sig` may be uninitialized [unbound-name]

psycopg (https://github.com/psycopg/psycopg)
- ERROR psycopg/psycopg/waiting.py:450:12-18: `wait_c` may be uninitialized [unbound-name]

mongo-python-driver (https://github.com/mongodb/mongo-python-driver)
- ERROR pymongo/asynchronous/pool.py:292:51-56: `start` may be uninitialized [unbound-name]
- ERROR pymongo/message.py:1142:39-53: `ns_doc_encoded` may be uninitialized [unbound-name]
- ERROR pymongo/synchronous/pool.py:292:51-56: `start` may be uninitialized [unbound-name]

stone (https://github.com/dropbox/stone)
- ERROR stone/backends/python_rsrc/stone_base.py:19:9-15: `typing` may be uninitialized [unbound-name]

websockets (https://github.com/aaugustin/websockets)
- ERROR src/websockets/frames.py:261:37-47: `mask_bytes` may be uninitialized [unbound-name]
- ERROR src/websockets/frames.py:327:42-52: `mask_bytes` may be uninitialized [unbound-name]
- ERROR src/websockets/legacy/framing.py:102:37-46: `mask_bits` may be uninitialized [unbound-name]

zulip (https://github.com/zulip/zulip)
- ERROR corporate/tests/test_stripe.py:731:13-23: `last_event` may be uninitialized [unbound-name]
- ERROR zerver/lib/import_realm.py:1280:17-41: `filename_to_has_original` may be uninitialized [unbound-name]
- ERROR zerver/lib/narrow.py:1323:16-28: `before_query` may be uninitialized [unbound-name]
- ERROR zerver/lib/narrow.py:1325:16-27: `after_query` may be uninitialized [unbound-name]

static-frame (https://github.com/static-frame/static-frame)
- ERROR static_frame/core/db_util.py:576:32-44: `query_create` may be uninitialized [unbound-name]
- ERROR static_frame/core/frame.py:1013:29-38: `fields_dc` may be uninitialized [unbound-name]
- ERROR static_frame/core/frame.py:6147:68-77: `drop_mask` may be uninitialized [unbound-name]
- ERROR static_frame/core/frame.py:6149:62-71: `drop_mask` may be uninitialized [unbound-name]
- ERROR static_frame/core/frame.py:9580:36-49: `columns_names` may be uninitialized [unbound-name]
- ERROR static_frame/core/type_blocks.py:175:48-57: `drop_mask` may be uninitialized [unbound-name]
- ERROR static_frame/core/type_blocks.py:180:45-54: `drop_mask` may be uninitialized [unbound-name]
- ERROR static_frame/core/type_blocks.py:271:48-57: `drop_mask` may be uninitialized [unbound-name]
- ERROR static_frame/core/type_blocks.py:276:45-54: `drop_mask` may be uninitialized [unbound-name]
- ERROR static_frame/core/util.py:1554:27-38: `is_not_none` may be uninitialized [unbound-name]
- ERROR static_frame/core/util.py:1563:20-31: `is_not_none` may be uninitialized [unbound-name]
- ERROR static_frame/core/util.py:3142:40-44: `cols` may be uninitialized [unbound-name]
- ERROR static_frame/core/util.py:3149:40-44: `cols` may be uninitialized [unbound-name]
- ERROR static_frame/core/util.py:3177:48-52: `cols` may be uninitialized [unbound-name]
- ERROR static_frame/core/util.py:3210:61-65: `cols` may be uninitialized [unbound-name]

pandas (https://github.com/pandas-dev/pandas)
- ERROR pandas/core/common.py:603:32-41: `old_value` may be uninitialized [unbound-name]
- ERROR pandas/tests/reshape/test_get_dummies.py:112:72-82: `fill_value` may be uninitialized [unbound-name]

mypy (https://github.com/python/mypy)
- ERROR mypy/test/testpep561.py:159:23-30: `program` may be uninitialized [unbound-name]
- ERROR mypyc/codegen/emitclass.py:419:64-82: `shadow_vtable_name` may be uninitialized [unbound-name]
- ERROR mypyc/irbuild/ll_builder.py:1093:28-32: `skip` may be uninitialized [unbound-name]
- ERROR mypyc/irbuild/ll_builder.py:2787:37-48: `false_block` may be uninitialized [unbound-name]

meson (https://github.com/mesonbuild/meson)
- ERROR mesonbuild/backend/ninjabackend.py:717:20-52: `captured_compile_args_per_target` may be uninitialized [unbound-name]
- ERROR mesonbuild/modules/_qt.py:760:38-45: `results` may be uninitialized [unbound-name]
- ERROR unittests/allplatformstests.py:5462:27-29: `cc` may be uninitialized [unbound-name]

core (https://github.com/home-assistant/core)
- ERROR homeassistant/components/bayesian/config_flow.py:557:25-34: `sub_entry` may be uninitialized [unbound-name]
- ERROR homeassistant/components/bayesian/config_flow.py:572:70-79: `sub_entry` may be uninitialized [unbound-name]
- ERROR homeassistant/components/opnsense/__init__.py:118:50-65: `interfaces_resp` may be uninitialized [unbound-name]
- ERROR homeassistant/components/recorder/util.py:126:49-60: `timer_start` may be uninitialized [unbound-name]
- ERROR homeassistant/components/tplink/config_flow.py:459:27-29: `un` may be uninitialized [unbound-name]
- ERROR homeassistant/components/tplink/config_flow.py:459:34-36: `pw` may be uninitialized [unbound-name]
- ERROR homeassistant/components/tplink/entity.py:678:45-64: `has_parent_entities` may be uninitialized [unbound-name]
- ERROR homeassistant/helpers/entity.py:1375:17-28: `update_warn` may be uninitialized [unbound-name]

paasta (https://github.com/yelp/paasta)
- ERROR paasta_tools/utils.py:2893:25-32: `service` may be uninitialized [unbound-name]
- ERROR paasta_tools/utils.py:2895:27-36: `component` may be uninitialized [unbound-name]
- ERROR paasta_tools/utils.py:2896:23-31: `loglevel` may be uninitialized [unbound-name]
- ERROR paasta_tools/utils.py:2897:25-32: `cluster` may be uninitialized [unbound-name]
- ERROR paasta_tools/utils.py:2898:26-34: `instance` may be uninitialized [unbound-name]
- ERROR paasta_tools/utils.py:2910:13-22: `proctimer` may be uninitialized [unbound-name]

spack (https://github.com/spack/spack)
- ERROR lib/spack/spack/config.py:728:34-42: `comments` may be uninitialized [unbound-name]

kornia (https://github.com/kornia/kornia)
- ERROR kornia/feature/disk/disk.py:117:39-40: `h` may be uninitialized [unbound-name]
- ERROR kornia/feature/disk/disk.py:117:43-44: `w` may be uninitialized [unbound-name]

spark (https://github.com/apache/spark)
- ERROR python/pyspark/pandas/groupby.py:350:56-61: `order` may be uninitialized [unbound-name]
- ERROR python/pyspark/pandas/groupby.py:351:75-82: `columns` may be uninitialized [unbound-name]
- ERROR python/pyspark/pandas/indexes/base.py:1597:57-69: `sequence_col` may be uninitialized [unbound-name]
- ERROR python/pyspark/pandas/tests/indexes/test_indexing_loc.py:263:32-38: `psser1` may be uninitialized [unbound-name]
- ERROR python/pyspark/pandas/tests/indexes/test_indexing_loc.py:263:40-45: `pser1` may be uninitialized [unbound-name]
- ERROR python/pyspark/pandas/tests/indexes/test_indexing_loc.py:264:32-38: `psser2` may be uninitialized [unbound-name]
- ERROR python/pyspark/pandas/tests/indexes/test_indexing_loc.py:264:40-45: `pser2` may be uninitialized [unbound-name]
- ERROR python/pyspark/pandas/tests/indexes/test_indexing_loc.py:332:32-38: `psser1` may be uninitialized [unbound-name]
- ERROR python/pyspark/pandas/tests/indexes/test_indexing_loc.py:332:40-45: `pser1` may be uninitialized [unbound-name]
- ERROR python/pyspark/pandas/tests/indexes/test_indexing_loc.py:333:32-38: `psser2` may be uninitialized [unbound-name]
- ERROR python/pyspark/pandas/tests/indexes/test_indexing_loc.py:333:40-45: `pser2` may be uninitialized [unbound-name]

prefect (https://github.com/PrefectHQ/prefect)
- ERROR src/prefect/cli/deployment.py:750:48-70: `parsed_interval_anchor` may be uninitialized [unbound-name]
- ERROR src/prefect/server/api/events.py:154:21-35: `backfilled_ids` may be uninitialized [unbound-name]

@github-actions

Primer Diff Classification

✅ 23 improvement(s) | 23 project(s) total | -99 errors

23 improvement(s) across pip, openlibrary, cloud-init, egglog-python, rotki, cryptography, scikit-learn, setuptools, psycopg, mongo-python-driver, stone, websockets, zulip, static-frame, pandas, mypy, meson, core, paasta, spack, kornia, spark, prefect.

| Project | Verdict | Changes | Error Kinds | Root Cause |
| --- | --- | --- | --- | --- |
| pip | ✅ Improvement | -1 | unbound-name | upgrade_to_guarded() |
| openlibrary | ✅ Improvement | -2 | unbound-name | upgrade_to_guarded() |
| cloud-init | ✅ Improvement | -1 | unbound-name | upgrade_to_guarded() |
| egglog-python | ✅ Improvement | -1 | unbound-name | upgrade_to_guarded() |
| rotki | ✅ Improvement | -4 | unbound-name | upgrade_to_guarded() |
| cryptography | ✅ Improvement | -2 | unbound-name | upgrade_to_guarded() |
| scikit-learn | ✅ Improvement | -21 | unbound-name | upgrade_to_guarded() |
| setuptools | ✅ Improvement | -1 | unbound-name | upgrade_to_guarded() |
| psycopg | ✅ Improvement | -1 | unbound-name | upgrade_to_guarded() |
| mongo-python-driver | ✅ Improvement | -3 | unbound-name | upgrade_to_guarded() |
| stone | ✅ Improvement | -1 | unbound-name | upgrade_to_guarded() |
| websockets | ✅ Improvement | -3 | unbound-name | upgrade_to_guarded() |
| zulip | ✅ Improvement | -4 | unbound-name | upgrade_to_guarded() |
| static-frame | ✅ Improvement | -15 | unbound-name | upgrade_to_guarded() |
| pandas | ✅ Improvement | -2 | Correlated condition false positives removed | upgrade_to_guarded() |
| mypy | ✅ Improvement | -4 | unbound-name | upgrade_to_guarded() |
| meson | ✅ Improvement | -3 | unbound-name | upgrade_to_guarded() |
| core | ✅ Improvement | -8 | unbound-name false positives for correlated conditions | upgrade_to_guarded() |
| paasta | ✅ Improvement | -6 | unbound-name | upgrade_to_guarded() |
| spack | ✅ Improvement | -1 | unbound-name | upgrade_to_guarded() |
| kornia | ✅ Improvement | -2 | unbound-name | upgrade_to_guarded() |
| spark | ✅ Improvement | -11 | Correlated condition false positives removed | upgrade_to_guarded() |
| prefect | ✅ Improvement | -2 | unbound-name | upgrade_to_guarded() |

Detailed analysis

✅ Improvement (23)

pip (-1)

This is a clear improvement. The removed error was a false positive — old_value is always initialized when value_changed is truthy, and the same value_changed guard controls both the assignment (line 740) and the use (line 746). The PR implements correlated condition analysis that correctly recognizes this pattern, as demonstrated by the test cases (test_guarded_initialization_basic, etc.) that were previously marked as bugs and are now passing.
Attribution: The PR adds correlated condition analysis in pyrefly/lib/binding/stmt.rs and pyrefly/lib/binding/scope.rs. Specifically, for a bare if <name>: with no elif/else, the code now records the guard via upgrade_to_guarded() (new method in scope.rs), creating a FlowStyle::InitializedIfGuardTruthy state. When a later if <name>: narrows the same guard (detected in bindings.rs via the clear_matching_truthy_guard call), the InitializedIfGuardTruthy style is upgraded to FlowStyle::Other, marking the variable as initialized. This directly fixes the false positive for old_value in pip's set_environ function, where value_changed guards both the assignment and the use.
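
For readers who want the shape of the pattern without opening pip's source, here is a minimal sketch. It is not pip's `set_environ`; the function `swap_and_restore` and its body are hypothetical, written only to show one bare-name guard controlling both the assignment and the later use.

```python
def swap_and_restore(settings: dict, key: str, value: object) -> object:
    """Hypothetical helper: temporarily set a key, then put the old value back."""
    value_changed = key in settings  # bare-name guard, never reassigned below
    if value_changed:
        old_value = settings[key]  # assigned only when the guard is truthy
    settings[key] = value
    observed = settings[key]
    if value_changed:
        settings[key] = old_value  # used only under the same guard
    return observed
```

At runtime `old_value` is never read unless it was assigned, which is the property the removed diagnostic failed to see.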

openlibrary (-2)

The removed errors were false positives caused by pyrefly's inability to recognize correlated conditions. The variables ia_itemname and s3_keys are only assigned inside the if user: block (lines 175-179). At line 180, the condition if not user or not ia_itemname or not s3_keys: uses short-circuit evaluation: if not user is True, Python short-circuits the or and never evaluates ia_itemname or s3_keys, so they don't need to be defined. If not user is False (i.e., user is truthy), then the if user: block on line 175 was entered, meaning ia_itemname and s3_keys were assigned. Therefore, ia_itemname and s3_keys are always defined before they are accessed at line 180. The PR implements correlated-condition analysis that correctly handles this pattern, eliminating the false positives.
Attribution: The PR adds correlated-condition analysis for simple guards. Specifically:

  1. In pyrefly/lib/binding/stmt.rs, when processing if <name>: with no elif/else, it captures the guard info and after the merge calls upgrade_to_guarded() to mark newly-defined names as InitializedIfGuardTruthy.
  2. In pyrefly/lib/binding/scope.rs, the new FlowStyle::InitializedIfGuardTruthy variant tracks that a name was defined only in the truthy branch of a guard. The clear_matching_truthy_guard() method in scope.rs upgrades these to FlowStyle::Other (fully initialized) when the same guard is later checked for truthiness.
  3. In pyrefly/lib/binding/bindings.rs, when a NarrowOp::Atomic(None, AtomicNarrowOp::IsTruthy) narrowing is applied, it calls clear_matching_truthy_guard() to resolve the correlated condition.

This directly fixes the false positives: user guards the definition of ia_itemname/s3_keys on line 175, and the not user check on line 180 is recognized as the same guard, so pyrefly now knows those variables are initialized when accessed.
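
The short-circuit argument above can be checked with a small sketch. This is not openlibrary's code: `can_borrow` and the dictionary keys are hypothetical stand-ins that only mirror the control flow described (assignment under `if user:`, then an `or` chain that starts with `not user`).

```python
from typing import Optional


def can_borrow(user: Optional[dict]) -> bool:
    """Hypothetical stand-in for the borrow check described above."""
    if user:
        ia_itemname = user.get("itemname")
        s3_keys = user.get("s3_keys")
    # When `user` is falsy, `not user` is True and the `or` chain stops there,
    # so the two names on the right are never evaluated.
    if not user or not ia_itemname or not s3_keys:
        return False
    return True
```

None of the calls below raises `NameError`, including the one where `user` is `None`.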

cloud-init (-1)

This is a clear improvement. The PR implements correlated-condition analysis that correctly recognizes that when the same boolean variable guards both a variable's definition and its use, the variable is guaranteed to be initialized at the point of use. The removed error on confd_fn at line 614 was a false positive — confd_fn is assigned at line 605 inside if create_confd: and used at line 614 also inside if create_confd:, so it's always initialized when accessed.
Attribution: The PR adds correlated-condition analysis for simple guards. Specifically, the changes in pyrefly/lib/binding/stmt.rs detect if <name>: patterns with no elif/else and record the guard. The new upgrade_to_guarded() method in pyrefly/lib/binding/scope.rs upgrades PossiblyUninitialized flow styles to InitializedIfGuardTruthy for names defined only in the truthy branch. Then clear_matching_truthy_guard() in the same file clears these guards when the same name is later narrowed via if <name>:, effectively marking the variable as initialized within that block. The new FlowStyle::InitializedIfGuardTruthy variant and its handling in FlowInfo::is_initialized() (returning DeferredCheck) complete the analysis. This directly fixes the false positive for confd_fn in cloud-init's test, where create_confd guards both the assignment and the use.
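
The state transitions described in these attributions can be modeled in a few lines. What follows is a toy Python model, not pyrefly's Rust code: the function and variant names are borrowed from the comment above, but the data structures, the `(name, version)` guard key, and the signatures are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from enum import Enum, auto
from typing import Dict, Optional, Tuple

# Assumption: a guard is keyed by its name plus a version that changes on reassignment.
Guard = Tuple[str, int]


class FlowStyle(Enum):
    POSSIBLY_UNINITIALIZED = auto()
    INITIALIZED_IF_GUARD_TRUTHY = auto()
    OTHER = auto()  # treated as initialized in this toy model


@dataclass(frozen=True)
class FlowInfo:
    style: FlowStyle
    guard: Optional[Guard] = None


Flow = Dict[str, FlowInfo]


def upgrade_to_guarded(flow: Flow, guard: Guard) -> Flow:
    """After merging `if <guard>:` with no else, tag maybe-unset names with the guard."""
    return {
        name: replace(info, style=FlowStyle.INITIALIZED_IF_GUARD_TRUTHY, guard=guard)
        if info.style is FlowStyle.POSSIBLY_UNINITIALIZED
        else info
        for name, info in flow.items()
    }


def clear_matching_truthy_guard(flow: Flow, guard: Guard) -> Flow:
    """Inside a later `if <guard>:`, names tagged with that same guard count as initialized."""
    return {
        name: FlowInfo(FlowStyle.OTHER)
        if info.style is FlowStyle.INITIALIZED_IF_GUARD_TRUTHY and info.guard == guard
        else info
        for name, info in flow.items()
    }
```

Returning a new mapping rather than mutating in place keeps the "initialized" view local to the narrowed branch in this model; whether the real implementation scopes it the same way is not something this sketch claims.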

egglog-python (-1)

This is a clear improvement. The variable egraphs is always initialized when visualize is truthy, and the use site is guarded by the same visualize condition. The PR correctly implements correlated-condition analysis to recognize this pattern and eliminate the false positive.
Attribution: The PR adds correlated-condition analysis in pyrefly/lib/binding/stmt.rs (the truthy_guard logic in if-statement handling) and pyrefly/lib/binding/scope.rs (the new FlowStyle::InitializedIfGuardTruthy variant, upgrade_to_guarded(), and clear_matching_truthy_guard() methods). When processing if visualize: at line 1229, the system now records that visualize is the guard. After the merge, names defined only in the truthy branch (like egraphs) get InitializedIfGuardTruthy style. When a later if visualize: is encountered (line 1243), clear_matching_truthy_guard() in pyrefly/lib/binding/bindings.rs upgrades the flow style to Other, suppressing the false unbound-name error.

rotki (-4)

All four removed errors are false positives caused by correlated conditions. The __debug__ guard is a compile-time constant — if identifier = random.random() executes inside if __debug__:, then any subsequent if __debug__: block will also execute, so identifier is guaranteed to be bound. The project authors already knew these were false positives (evidenced by # pyright: ignore # if debug identifier is set comments). The PR's correlated-condition analysis correctly recognizes this pattern and suppresses the spurious unbound-name warnings.
Attribution: The PR adds correlated-condition analysis in pyrefly/lib/binding/stmt.rs (the truthy_guard logic in if-statement handling), pyrefly/lib/binding/scope.rs (new FlowStyle::InitializedIfGuardTruthy variant, upgrade_to_guarded(), and clear_matching_truthy_guard() methods), and pyrefly/lib/binding/bindings.rs (calling clear_matching_truthy_guard() when a bare if <name>: narrowing is applied). This new flow style tracks that a variable was defined only in the truthy branch of if <guard>:, and when a later if <guard>: is encountered with the same guard value, the variable is treated as initialized (FlowStyle::Other) instead of PossiblyUninitialized.
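
A reduced version of the `__debug__` case, for illustration only. The function `run_query` and its log messages are hypothetical and not taken from rotki; the point is that `__debug__` has the same value at both checks within one interpreter run.

```python
import random
from typing import List


def run_query(log: List[str]) -> None:
    """Hypothetical sketch: tag a query with a random identifier in debug runs only."""
    if __debug__:
        identifier = random.random()
        log.append(f"start {identifier}")
    log.append("query")
    if __debug__:
        # Same constant guard as above, so `identifier` is bound whenever this runs.
        log.append(f"end {identifier}")
```

Under `python -O` both guarded blocks are skipped, so the name is still never read unbound.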

cryptography (-2)

Both removed errors were false positives from pyrefly's inability to correlate conditions. The variable with_cert guards both the definition (cert_body = rest on line 1060, nonce, rest = _get_sshstr(rest) on line 1065) and the use (the entire block starting at line 1067's if with_cert:). Since with_cert is never reassigned between these points, the variables are always initialized when accessed. The PR's correlated-condition analysis correctly recognizes this pattern and suppresses the spurious unbound-name errors.
Attribution: The PR adds correlated-condition analysis for simple guards. Specifically, the new FlowStyle::InitializedIfGuardTruthy variant in pyrefly/lib/binding/scope.rs tracks that a variable was defined only inside if <guard>: (no else). The upgrade_to_guarded() method in scope.rs upgrades PossiblyUninitialized flow styles to InitializedIfGuardTruthy after merging an if-without-else. Then clear_matching_truthy_guard() in scope.rs clears these guards when the same guard name is narrowed via IsTruthy in bindings.rs. The if-statement handling in pyrefly/lib/binding/stmt.rs captures the guard info before the fork and applies upgrade_to_guarded() after the merge. This allows pyrefly to recognize that cert_body and nonce are always initialized when with_cert is truthy.

scikit-learn (-21)

All 21 removed unbound-name errors were false positives caused by pyrefly's inability to recognize correlated conditions. The classic pattern is: if cond: x = val followed by if cond: use(x) — since cond hasn't changed, x is always initialized when used. The PR implements correlated-condition analysis that correctly handles this pattern, eliminating these false positives across 10 sklearn files.
Attribution: The PR adds correlated-condition analysis for simple guards. Specifically: (1) In pyrefly/lib/binding/stmt.rs, when processing if <name>: with no elif/else, it captures the guard name and its current flow index, then after the merge calls upgrade_to_guarded(). (2) In pyrefly/lib/binding/scope.rs, the new FlowStyle::InitializedIfGuardTruthy variant tracks that a variable was defined only in the truthy branch of a specific guard. The upgrade_to_guarded() method converts PossiblyUninitialized/MaybeInitialized to this new variant. The clear_matching_truthy_guard() method (called from pyrefly/lib/binding/bindings.rs when an IsTruthy narrow is applied) marks matching entries as FlowStyle::Other (initialized), suppressing the false positive.
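
The "cond hasn't changed" condition matters, and a short sketch shows why. Both functions below are hypothetical (they are not scikit-learn code): the first follows the safe pattern, the second reassigns the guard between the two checks and fails at runtime.

```python
def weighted_total(values, weights=None):
    """Safe: the guard `use_weights` is not reassigned between the two checks."""
    use_weights = weights is not None
    if use_weights:
        sw = list(weights)
    total = float(sum(values))
    if use_weights:
        total = float(sum(v * w for v, w in zip(values, sw)))
    return total


def broken_total(values, weights=None):
    """Unsafe: the guard is reassigned, so the second check no longer implies the first."""
    use_weights = weights is not None
    if use_weights:
        sw = list(weights)
    use_weights = True  # guard changed between definition and use
    if use_weights:
        return float(sum(v * w for v, w in zip(values, sw)))
    return float(sum(values))
```

Calling `broken_total([1, 2])` raises a `NameError` subclass because `sw` was never bound, so a checker should keep reporting that variant.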

setuptools (-1)

This is a clear false positive removal. The pattern if X: define_var followed by if X: use_var (where X is not reassigned) guarantees the variable is initialized at the use site. The PR implements correlated-condition analysis that recognizes this pattern, correctly eliminating the spurious unbound-name error on new_sig at line 140.
Attribution: The PR adds correlated-condition analysis for simple guards. Specifically, the new FlowStyle::InitializedIfGuardTruthy variant in pyrefly/lib/binding/scope.rs tracks variables that are only defined inside if <guard>: blocks. The upgrade_to_guarded() method in scope.rs marks such variables after the if-statement merge. Then clear_matching_truthy_guard() in scope.rs (called from narrow_in_current_flow in pyrefly/lib/binding/bindings.rs) clears the guard when the same name is narrowed via IsTruthy, treating the variable as definitely initialized. The detection of the guard pattern happens in pyrefly/lib/binding/stmt.rs where truthy_guard is computed for bare if <name>: statements with no elif/else clauses. In the setuptools case, pass_loop is the guard — new_sig is defined inside if pass_loop: and used inside another if pass_loop:, so the correlated-condition analysis correctly determines new_sig is initialized.

psycopg (-1)

This is a clear false positive removal. The variable wait_c is assigned on line 426 inside if _psycopg:, and used on line 450 inside elif _psycopg and ...:. Since the elif requires _psycopg to be truthy, and wait_c is always assigned when _psycopg is truthy, the variable is guaranteed to be initialized at the point of use. The PR implements correlated-condition analysis that correctly recognizes this pattern, eliminating the false positive.
Attribution: The PR adds correlated-condition analysis for simple guards. Specifically, the new FlowStyle::InitializedIfGuardTruthy variant in pyrefly/lib/binding/scope.rs tracks that a variable was defined only inside an if <guard>: block. The upgrade_to_guarded() method in pyrefly/lib/binding/scope.rs upgrades PossiblyUninitialized flow styles to InitializedIfGuardTruthy after merging an if-statement with no else clause. Then clear_matching_truthy_guard() in pyrefly/lib/binding/scope.rs clears the guard when the same name is later narrowed via IsTruthy (detected in pyrefly/lib/binding/bindings.rs). The logic in pyrefly/lib/binding/stmt.rs captures the guard information before the fork and applies upgrade_to_guarded after the merge. This allows pyrefly to recognize that wait_c (assigned under if _psycopg:) is definitely initialized when later used under elif _psycopg ...:.
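
The `elif <guard> and ...` shape differs slightly from the plain double-`if` case, so a sketch may help. This is not psycopg's `waiting.py`; `pick_wait`, the `kind` strings, and the return values are hypothetical.

```python
def pick_wait(kind: str) -> str:
    """Hypothetical selector mirroring `if guard:` followed by `elif guard and ...:`."""
    fast = kind != "pure"
    if fast:
        wait_c = "c-accelerated wait"
    if kind == "select":
        chosen = "select wait"
    elif fast and kind == "auto":
        # `fast` must be truthy to reach this line, so `wait_c` was assigned above.
        chosen = wait_c
    else:
        chosen = "generic wait"
    return chosen
```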

mongo-python-driver (-3)

All three removed errors are false positives being fixed. The pattern is: a variable is assigned inside if X: and later used inside another if X: where X hasn't been reassigned. The variable is guaranteed to be initialized at the use site. Pyrefly's new correlated condition analysis correctly recognizes this pattern and suppresses the spurious unbound-name warnings. This is a clear improvement in pyrefly's analysis precision.
Attribution: The PR adds correlated condition analysis for simple guards. Specifically: (1) pyrefly/lib/binding/scope.rs introduces a new FlowStyle::InitializedIfGuardTruthy variant that tracks variables defined only in the truthy branch of if <guard>: with no else clause, along with methods upgrade_to_guarded() and clear_matching_truthy_guard(). (2) pyrefly/lib/binding/stmt.rs detects bare if <name>: patterns with no elif/else and calls upgrade_to_guarded() after the fork merge. (3) pyrefly/lib/binding/bindings.rs calls clear_matching_truthy_guard() when a subsequent if <name>: narrows the same guard, upgrading the flow style to Other (initialized). This correctly eliminates the false positive unbound-name errors for the correlated condition pattern.

stone (-1)

This is a clear improvement. The pattern _MYPY = False; if _MYPY: import typing; ...; if _MYPY: T = typing.TypeVar(...) is a standard Python idiom (equivalent to TYPE_CHECKING). The variable typing is always initialized when it's used because both the definition and use are guarded by the same condition. The old error was a false positive that the PR's correlated condition analysis correctly eliminates. The test cases in the PR diff (test_guarded_initialization_basic, test_guarded_initialization_multiple_variables, etc.) confirm this is the intended fix.
Attribution: The PR introduces correlated condition analysis in pyrefly/lib/binding/stmt.rs (the truthy_guard logic around line 994-1007) and pyrefly/lib/binding/scope.rs (the new FlowStyle::InitializedIfGuardTruthy variant and the upgrade_to_guarded() / clear_matching_truthy_guard() methods). When pyrefly encounters if _MYPY: with no else clause, it now records the guard. After the fork merges, upgrade_to_guarded() marks newly-defined names (like typing) as InitializedIfGuardTruthy. When a subsequent if _MYPY: is encountered, clear_matching_truthy_guard() in pyrefly/lib/binding/bindings.rs (around line 1751-1754) recognizes the IsTruthy narrow on _MYPY and clears the guard, treating typing as fully initialized.

websockets (-3)

All three removed errors are false positives. The pattern if mask: x = ...; if mask: use(x) guarantees x is initialized at the use site. Pyrefly's new correlated-condition analysis correctly recognizes that when the same boolean variable guards both the assignment and the use (without being reassigned in between), the variable is always initialized in the use context. This is a clear improvement in pyrefly's control flow analysis.
Attribution: The PR adds a new FlowStyle::InitializedIfGuardTruthy variant in pyrefly/lib/binding/scope.rs and correlated-condition analysis logic in pyrefly/lib/binding/stmt.rs (around the if statement handling) and pyrefly/lib/binding/bindings.rs (in the narrowing logic). Specifically:

  • stmt.rs: When processing if <name>: with no elif/else, it captures the guard info and after the fork merge calls upgrade_to_guarded() to mark newly-defined names as InitializedIfGuardTruthy.
  • scope.rs: clear_matching_truthy_guard() is called when a later if <name>: narrows the same guard, upgrading the flow style to Other (fully initialized).
  • bindings.rs: The IsTruthy narrow op triggers clear_matching_truthy_guard().

This directly fixes the false positive where mask_bytes/mask_bits was reported as potentially uninitialized despite being guarded by the same condition.

zulip (-4)

All four removed errors are false positives caused by pyrefly's previously incomplete flow analysis. The PR adds correlated-condition analysis: when a variable is defined only inside if guard: (no else), and later used inside another if guard: (where guard hasn't been reassigned), pyrefly now correctly recognizes the variable as initialized. The test cases in the PR (e.g., test_guarded_initialization_basic removing its bug annotation) confirm this was a known false positive being fixed.
Attribution: The changes in pyrefly/lib/binding/scope.rs (adding FlowStyle::InitializedIfGuardTruthy variant and upgrade_to_guarded()/clear_matching_truthy_guard() methods), pyrefly/lib/binding/stmt.rs (detecting bare if <name>: patterns and applying upgrade_to_guarded after fork merge), and pyrefly/lib/binding/bindings.rs (calling clear_matching_truthy_guard when a truthy narrow is applied) together implement the correlated-condition analysis that eliminates these false positives.

static-frame (-15)

This is a clear improvement. The PR implements correlated condition analysis — when a variable is assigned inside if x: and later used inside another if x: (where x hasn't been reassigned), pyrefly now correctly recognizes that the variable is guaranteed to be initialized. All 15 removed unbound-name errors were false positives following this exact pattern. The test cases in the PR diff confirm this intent, with previously-marked bug = "false positive" annotations being removed as the bugs are fixed.
Attribution: The PR introduces a new FlowStyle::InitializedIfGuardTruthy variant in pyrefly/lib/binding/scope.rs that tracks when a variable was defined only on the truthy branch of if <guard>: (with no else). The upgrade_to_guarded() method in pyrefly/lib/binding/scope.rs upgrades PossiblyUninitialized flow styles to InitializedIfGuardTruthy after merging an if-statement with no else. Then clear_matching_truthy_guard() in the same file clears these guards when the same name is narrowed via IsTruthy in pyrefly/lib/binding/bindings.rs. The detection of the guard pattern happens in pyrefly/lib/binding/stmt.rs where truthy_guard is computed for bare if <name>: statements with no elif/else clauses.

pandas (-2)

Correlated condition false positives removed: Both old_value in temp_setattr and fill_value in test_get_dummies_basic_types are defined and used under the same boolean guard (condition and sparse respectively). The PR's new InitializedIfGuardTruthy flow style correctly recognizes these patterns and suppresses the spurious unbound-name errors.

Overall: Both removed errors are false positives from pyrefly's inability to track correlated conditions. The PR implements correlated-condition analysis, which recognizes that when a variable is defined inside if X: and later used inside another if X: (where X hasn't been reassigned), the variable is guaranteed to be initialized. This is a well-known pattern that other type checkers handle correctly.

For old_value in temp_setattr (line 603): old_value is assigned on line 597 inside if condition:, and used on line 603 inside another if condition: block. Since condition is a parameter that isn't reassigned between these two checks, if the second if condition: is true, the first one must have been true too, guaranteeing old_value was initialized.
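
A simplified sketch of this shape, assuming a context manager that conditionally swaps an attribute. It is modeled on the description above rather than copied from pandas; temp_setattr_sketch and the config object are hypothetical.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temp_setattr_sketch(obj, attr, value, condition=True):
    # `old_value` is bound only when `condition` is truthy.
    if condition:
        old_value = getattr(obj, attr)
        setattr(obj, attr, value)
    try:
        yield obj
    finally:
        # Same unchanged guard, so `old_value` is bound whenever it is read.
        if condition:
            setattr(obj, attr, old_value)


config = SimpleNamespace(mode="fast")
with temp_setattr_sketch(config, "mode", "safe"):
    inside = config.mode   # value while the override is active
after = config.mode        # value after the original is restored
```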

For fill_value in test_get_dummies_basic_types (line 112): fill_value is assigned on lines 97-101 inside if sparse: (with branches for different dtype conditions), and used on line 112 inside another if sparse: block. Since sparse is a fixture parameter that isn't reassigned between these two checks, if the second if sparse: is true, the first one must have been true too, guaranteeing fill_value was initialized.

Attribution: The PR adds correlated-condition analysis for simple if <name>: guards. Specifically: (1) A new FlowStyle::InitializedIfGuardTruthy variant in pyrefly/lib/binding/scope.rs tracks variables defined only in the truthy branch of an if <guard>: with no else. (2) In pyrefly/lib/binding/stmt.rs, after processing an if statement with no elif/else, upgrade_to_guarded() is called to mark newly-defined variables with this flow style. (3) In pyrefly/lib/binding/bindings.rs, when a subsequent if <name>: narrows the same guard variable, clear_matching_truthy_guard() upgrades the flow style to Other (initialized), suppressing the false positive. This directly fixes both pandas errors where variables were defined and used under the same boolean guard.

mypy (-4)

All four removed errors are false positives from correlated-condition patterns. The PR implements a well-designed analysis that tracks when a variable is initialized inside if <guard>: and later used inside another if <guard>: block with the same guard. Since the guard hasn't been reassigned between the two checks, the variable is guaranteed to be initialized at the point of use. Removing these false positives is a clear improvement in pyrefly's analysis precision.
Attribution: The PR adds correlated-condition analysis for simple guards. Specifically:

  • pyrefly/lib/binding/stmt.rs: In the if statement handling, it detects bare if <name>: with no elif/else, captures the guard name and pre-fork flow state, then after the merge calls upgrade_to_guarded() to mark newly-defined names as InitializedIfGuardTruthy.
  • pyrefly/lib/binding/scope.rs: Adds the new FlowStyle::InitializedIfGuardTruthy variant, implements upgrade_to_guarded() to convert PossiblyUninitialized/MaybeInitialized to the new guarded style, and clear_matching_truthy_guard() to resolve the guard when a later if <name>: narrows the same guard.
  • pyrefly/lib/binding/bindings.rs: In the narrowing logic, when a bare if <name>: truthiness narrow is applied, calls clear_matching_truthy_guard() to mark the guarded variables as definitely initialized.

This directly eliminates the false positive unbound-name errors for variables that are initialized under the same guard condition that later protects their use.
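
The three steps above can be pictured with a toy Python model. This is only a sketch of the state transitions as described in this comment, not pyrefly's Rust implementation: ToyFlow and its fields are hypothetical, only the PossiblyUninitialized case is modeled, and scoping, forking, and guard reassignment are ignored.

```python
from enum import Enum, auto


class FlowStyle(Enum):
    OTHER = auto()                        # treated as definitely initialized
    POSSIBLY_UNINITIALIZED = auto()       # bound on some paths only
    INITIALIZED_IF_GUARD_TRUTHY = auto()  # bound whenever the recorded guard is truthy


class ToyFlow:
    """Toy stand-in for per-name flow state; not pyrefly's data structure."""

    def __init__(self):
        self.styles = {}  # name -> FlowStyle
        self.guards = {}  # guarded name -> guard name

    def upgrade_to_guarded(self, guard, names_before_fork):
        # After merging `if guard:` (no elif/else), names that first appeared
        # in the body and are possibly uninitialized become guarded.
        for name, style in list(self.styles.items()):
            is_new = name not in names_before_fork
            if is_new and style is FlowStyle.POSSIBLY_UNINITIALIZED:
                self.styles[name] = FlowStyle.INITIALIZED_IF_GUARD_TRUTHY
                self.guards[name] = guard

    def clear_matching_truthy_guard(self, guard):
        # A later truthiness narrow on the same guard resolves those names.
        for name, recorded in list(self.guards.items()):
            if recorded == guard:
                self.styles[name] = FlowStyle.OTHER
                del self.guards[name]


flow = ToyFlow()
flow.styles["cond"] = FlowStyle.OTHER
before = set(flow.styles)                            # snapshot before the fork
flow.styles["x"] = FlowStyle.POSSIBLY_UNINITIALIZED  # after merging `if cond: x = ...`
flow.upgrade_to_guarded("cond", before)
after_merge = flow.styles["x"]
flow.clear_matching_truthy_guard("cond")             # entering a later `if cond:`
inside_second_if = flow.styles["x"]
```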

meson (-3)

All three removed errors are false positives that pyrefly previously reported because it couldn't track correlated conditions. The pattern is: a variable is assigned inside if guard: (with no else), then later used inside another if guard: block. Since the same condition controls both the definition and the use, the variable is always initialized when accessed. The PR implements correlated condition analysis to handle this common pattern, correctly eliminating these false positives.
Attribution: The changes in pyrefly/lib/binding/scope.rs introduce a new FlowStyle::InitializedIfGuardTruthy variant that tracks when a variable was defined only in the truthy branch of if <guard>: with no else clause. The upgrade_to_guarded() method in scope.rs upgrades PossiblyUninitialized to this new style after merging an if-without-else. The clear_matching_truthy_guard() method in scope.rs clears the guard when the same name is later narrowed with IsTruthy. The changes in pyrefly/lib/binding/stmt.rs detect the if <name>: pattern (no elif/else), snapshot pre-fork names, and call upgrade_to_guarded after the merge. The changes in pyrefly/lib/binding/bindings.rs call clear_matching_truthy_guard when a truthy narrowing occurs. Together, these changes enable pyrefly to recognize that variables defined under if x: are always initialized when later accessed under the same if x: guard.

core (-8)

unbound-name false positives for correlated conditions: All 8 removed errors follow the pattern: variable assigned inside if cond: and used inside a later if cond: or in a conditional expression guarded by the same condition (e.g., sub_entry assigned under if reconfiguring: and used in sub_entry if reconfiguring else None). The guard variable is not reassigned between definition and use, so the variable is guaranteed to be initialized at the point of use. These were false positives that pyrefly now correctly handles via the new correlated condition flow analysis.

Overall: This is a clear improvement. The PR implements correlated condition analysis, a well-known flow analysis technique where variables defined under if guard: are recognized as initialized when accessed under a subsequent if guard: with the same unchanged guard. Two of the removed unbound-name errors, both in the same file, involve sub_entry, which is assigned inside if reconfiguring: (line 543) and then used at line 557 inside another if reconfiguring: block, and at line 572 in the expression _get_observation_values_for_editing(sub_entry) if reconfiguring else None. In both cases, sub_entry is guaranteed to be initialized at the point of use because the same reconfiguring guard (a parameter that is never reassigned) protects both the definition and the use. The remaining 6 errors across other files follow analogous patterns. The test cases in the PR diff confirm this intent, removing bug = "false positive" annotations from multiple test cases.
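
The conditional-expression variant can be sketched as follows. Only the names sub_entry and reconfiguring come from the description above; edit_defaults and stored are hypothetical stand-ins, not code from core.

```python
from typing import Optional


def edit_defaults(reconfiguring: bool, stored: Optional[dict] = None) -> Optional[dict]:
    # `sub_entry` is bound only when `reconfiguring` is truthy.
    if reconfiguring:
        sub_entry = stored or {}

    # The conditional expression evaluates `sub_entry` only when the same,
    # unchanged guard is truthy, so the read is always safe.
    return dict(sub_entry) if reconfiguring else None
```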

Attribution: The PR adds a new FlowStyle::InitializedIfGuardTruthy variant in pyrefly/lib/binding/scope.rs and correlated-condition analysis in pyrefly/lib/binding/stmt.rs. Specifically, in stmt.rs, when processing an if statement with no elif/else clauses and a bare name test (if <name>:), it captures the guard info before the fork. After the merge in finish_non_exhaustive_fork, upgrade_to_guarded() in scope.rs upgrades PossiblyUninitialized flow styles to InitializedIfGuardTruthy. Then in bindings.rs, when a subsequent if <name>: narrowing occurs, clear_matching_truthy_guard() promotes the guarded names to FlowStyle::Other (fully initialized), eliminating the false positive.

paasta (-6)

All 6 removed errors are false positives for the same pattern in _run(). The function has if log: on line 2843 that assigns service, component, cluster, instance, loglevel, and then if log: on line 2891 that uses those same variables. Since the same boolean log guards both blocks, the variables are always initialized when used. The PR's correlated-condition analysis correctly recognizes this pattern and suppresses the spurious unbound-name errors.
Attribution: The changes span multiple files implementing correlated-condition analysis:

  1. pyrefly/lib/binding/scope.rs: Added FlowStyle::InitializedIfGuardTruthy variant, upgrade_to_guarded() method, clear_matching_truthy_guard() method, and current_flow_names() method. The InitializedIfGuardTruthy flow style tracks that a variable was defined only in the truthy branch of an if <guard>: and can be considered initialized inside a later if <guard>: block.
  2. pyrefly/lib/binding/stmt.rs: In the if-statement handling, captures the guard name and pre-fork names before forking, then calls upgrade_to_guarded() after the merge to mark newly-defined names with InitializedIfGuardTruthy.
  3. pyrefly/lib/binding/bindings.rs: In narrowing logic, when a bare if <name>: is encountered, calls clear_matching_truthy_guard() to upgrade InitializedIfGuardTruthy entries to FlowStyle::Other (fully initialized), eliminating the false positive.

spack (-1)

This is a clear false positive removal. The variable comments is defined inside if need_comment_copy: and used inside if need_comment_copy and comments:. The guard need_comment_copy is not reassigned between the two checks, so comments is guaranteed to be initialized whenever it is accessed. The PR's correlated-condition analysis correctly recognizes this pattern.
Attribution: The change in pyrefly/lib/binding/stmt.rs detects if <name>: patterns with no elif/else and records the guard. After the fork merge, upgrade_to_guarded() in pyrefly/lib/binding/scope.rs marks newly-defined names as InitializedIfGuardTruthy. Then in pyrefly/lib/binding/bindings.rs, clear_matching_truthy_guard() clears the guard when the same name is narrowed via IsTruthy, promoting the flow style to Other (initialized). This directly fixes the false positive for comments guarded by need_comment_copy.
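
A sketch of the compound-guard shape, where the second check is guard and value rather than a bare guard. The names need_comment_copy and comments follow the description above; copy_with_comments and its body are hypothetical, not spack code.

```python
def copy_with_comments(lines, need_comment_copy):
    # `comments` is bound only when `need_comment_copy` is truthy.
    if need_comment_copy:
        comments = [ln for ln in lines if ln.lstrip().startswith("#")]

    body = [ln for ln in lines if not ln.lstrip().startswith("#")]

    # `and` short-circuits: `comments` is read only after the same,
    # unchanged guard has evaluated truthy.
    if need_comment_copy and comments:
        body = comments + body
    return body
```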

kornia (-2)

Both removed errors were false positives. The pattern if guard: x = val followed by if guard: use(x) guarantees x is initialized when used, because the same condition controls both paths. Pyrefly's new correlated-condition analysis correctly recognizes this pattern and no longer reports h and w as potentially uninitialized. This matches the behavior of mypy and pyright, which handle this pattern correctly.
Attribution: The PR adds correlated-condition analysis for simple guards. Specifically, the new FlowStyle::InitializedIfGuardTruthy variant in pyrefly/lib/binding/scope.rs tracks variables that are only initialized inside a truthy guard branch. The upgrade_to_guarded() method in scope.rs upgrades PossiblyUninitialized flow styles to InitializedIfGuardTruthy after merging an if <name>: block with no else clause. Then clear_matching_truthy_guard() in scope.rs (called from bindings.rs when processing narrowing for IsTruthy) marks those variables as fully initialized when the same guard is checked again. The trigger in stmt.rs captures the guard name and pre-fork names before the fork, then calls upgrade_to_guarded after the merge. This directly fixes the false positive for the if pad_if_not_divisible: h, w = ... ; if pad_if_not_divisible: use(h, w) pattern.

spark (-11)

Correlated condition false positives removed: All 11 removed errors follow the pattern: variable assigned in if <guard>: (no else), used in later if <guard>: where guard is unchanged. The new correlated-condition analysis correctly recognizes these variables are always initialized at point of use.

Overall: This is a clear improvement. The PR implements correlated-condition analysis that correctly recognizes that when a variable is assigned inside if x: and later used inside another if x: (where x hasn't been reassigned), the variable is guaranteed to be initialized. All 11 removed unbound-name errors were false positives: the variables (order, columns, etc.) were always initialized at their point of use because the same boolean guard controlled both definition and use. The test cases in the PR diff confirm this intent, with previously-marked bugs now passing correctly.

Attribution: The PR adds correlated-condition analysis for simple guards. The key changes are:

  1. pyrefly/lib/binding/scope.rs: New FlowStyle::InitializedIfGuardTruthy variant that tracks when a variable was defined only in the truthy branch of if <guard>: with no else clause. New methods upgrade_to_guarded() and clear_matching_truthy_guard() manage this state.
  2. pyrefly/lib/binding/stmt.rs: In the If statement handling, after processing a bare if <name>: with no elif/else, calls upgrade_to_guarded() to mark newly-defined names with the guard information.
  3. pyrefly/lib/binding/bindings.rs: When narrowing via IsTruthy, calls clear_matching_truthy_guard() to mark guarded variables as definitely initialized inside the matching if <guard>: block.
  4. pyrefly/lib/binding/scope.rs InitializedInFlow: The new InitializedIfGuardTruthy variant returns Conditionally when termination_keys is empty, or DeferredCheck otherwise, properly integrating with the existing initialization checking.

prefect (-2)

Both removed errors are false positives from pyrefly's previously incomplete correlated-condition analysis. The PR implements tracking of the pattern where if guard: defines a variable and a subsequent if guard: uses it — since the guard hasn't been reassigned, the variable is guaranteed to be initialized. This is a well-known flow analysis improvement that mypy and pyright already handle. The test cases in the PR diff confirm this was a known bug (the bug = "false positive" annotations are removed from the test cases).

For deployment.py:750: The variable parsed_interval_anchor is assigned inside if interval_anchor: (line 732-733), and then used inside another if interval_anchor: (line 749-750). Since interval_anchor is a function parameter that is not reassigned between the two checks, the second check guarantees the variable was initialized by the first check. This is a classic correlated-condition pattern.

For events.py:154: The variable backfilled_ids is assigned inside if wants_backfill: (line 118-119), and then used inside if wants_backfill and event.id in backfilled_ids: (line 153-154). Since wants_backfill is not reassigned between the two checks, the condition wants_backfill being true at line 153 guarantees that the if wants_backfill: block at line 118 was executed, meaning backfilled_ids was initialized. This is the same correlated-condition pattern.

Attribution: The PR adds a new FlowStyle::InitializedIfGuardTruthy variant in pyrefly/lib/binding/scope.rs that tracks when a variable is defined only under a truthy guard (if <name>:). The key changes are:

  1. In pyrefly/lib/binding/stmt.rs, when processing an if statement with no elif/else clauses and a bare name test, it captures the guard info (truthy_guard) before forking, then after the merge calls upgrade_to_guarded() to mark newly-defined names as InitializedIfGuardTruthy.

  2. In pyrefly/lib/binding/scope.rs, upgrade_to_guarded() converts PossiblyUninitialized/MaybeInitialized flow styles to InitializedIfGuardTruthy for names introduced in the if-body.

  3. In pyrefly/lib/binding/bindings.rs, when a NarrowOp::Atomic(None, AtomicNarrowOp::IsTruthy) narrowing is applied (i.e., if <name>:), clear_matching_truthy_guard() is called, which upgrades matching InitializedIfGuardTruthy entries to FlowStyle::Other (meaning definitely initialized).

  4. The InitializedInFlow check in FlowInfo::is_initialized() (https://github.com/facebook/pyrefly/blob/main/pyrefly/lib/binding/scope.rs) handles the new variant: if termination_keys is empty, it's Conditionally initialized; otherwise it defers to the termination key check (preserving NoReturn fallback behavior).


Classification by primer-classifier (23 LLM)

Development

Successfully merging this pull request may close these issues.

Could unbound-name be made less noisy?